1 WHAT IS CLAIMED IS:

1. A method to be adopted in systems utilising time/frequency grid coding of audio
signals, characterised by

(a) Deriving the start time border from the end time border of the previous frame of
envelope data;

(b) Detecting the most drastic transient time slot with a transient detector in the
spectral data between the said start time border and the furthest allowed end time
border;

(c) Finding and instantiating an actual end time border and intermediate time borders in
the spectral data between the said transient time slot and the furthest allowed end time
border by evaluating a signal variation criterion;

(d) Deriving the frequency resolution by evaluating the energy of every frequency band
flanked by the low-resolution borders for every time segment obtained above.

2. A method according to claim 1, characterised in that if the number of borders allowed
has been exhausted but the end time border found does not satisfy a minimum required
value, the said intermediate borders are expanded until the minimum required value is
attained.

3. A method according to claim 1, characterised in that more intermediate time borders
can be instantiated in the spectral data between the transient time slot and the start
time border by evaluating the said signal variation criterion, if the number of borders
allowed has not been exhausted.

4. A method according to claim 1, characterised in that the said process of finding an
intermediate time border first defines a temporary time segment with the previously
found time border and a moving time border which moves progressively away from the said
previous time border, and then evaluates the said signal variation criterion for every
move the said moving time border makes.

5. A method according to claim 1, characterised in that the said signal variation
criterion is the ratio between the minimum energy of a time slot within the said
temporary time segment and the average energy of the said temporary time segment.

6. A method according to claim 5, characterised in that if the said computed ratio
exceeds a threshold, a new intermediate or end border is instantiated according to the
said moving time border to define a new time segment.

7. A method according to claim 2, characterised in that the said expansion of borders
can occur to the time segment furthest away from the transient time slot within the said
frame first, and time segments nearer to the transient time slot are considered only
when the expansion of the further border has reached its syntactic limit. The said
expansion of borders can also try to increase every time segment, check the signal
characteristics of the new time segment formed, and apply the actual increase to the
time segment that causes the least overall increase in between-border signal variations.

8. A method according to claim 1, characterised in that the said energy evaluation
computes the ratios between the energies of the frequency bands for every time segment
found. If the minimum of the ratios exceeds a threshold, a high frequency resolution is
adopted; otherwise, a low frequency resolution is adopted.

9. A method according to claim 8, characterised in that the said threshold is higher in
a plurality of time segments immediately following the transient time border, to make it
more difficult to switch to high frequency resolution in the said region.

10. A method to be adopted in bandwidth extension strategies utilising the said
time/frequency grid coding approach, where an analysis filterbank transforms an audio
signal into a plurality of low-frequency subband signals, where portions of the said
subband signals are replicated to the high-frequency region, where the said replicated
subbands are divided into time segments using the said time borders information and
subsequently into frequency bands using the said frequency resolutions information and
subsequently modulated by the said envelope data, where a synthesis filterbank
transforms the said low-frequency subband signals and the said envelope-adjusted subband
signals into a bandwidth-extended, time-domain signal, characterised by

(a) Deriving the start time border from the end time border of the previous frame of
envelope data;

(b) Detecting the most drastic transient time slot with a transient detector in the
spectral data between the said start time border and the furthest allowed end time
border;

(c) Finding and instantiating an actual end time border and intermediate time borders in
the spectral data between the said transient time slot and the furthest allowed end time
border by evaluating a signal variation criterion;

(d) Deriving the frequency resolution by evaluating the energy of every frequency band
flanked by the low-resolution borders for every time segment obtained above.

11. A method to be adopted in systems utilising time/frequency grid coding of audio
signals, characterised by

(a) Deriving the start time border from the end time border of the previous frame of
envelope data;

(b) Detecting the most drastic transient time slot with a transient detector in the
spectral data between the said start time border and the furthest allowed end time
border;

(c) Detecting which of the regions, one between the transient border and the start time
border, another between the transient border and the furthest allowed end time border,
has the most varying spectral data;

(d) If the said most varying spectral data is found in the region between the transient
border and the furthest allowed end time border, finding and instantiating an actual end
time border and intermediate time borders in the said region by evaluating a signal
variation criterion;

(e) If the said most varying spectral data is found in the region between the transient
border and the start time border, finding and instantiating intermediate borders in the
said region by evaluating a signal variation criterion, then finding and instantiating
an actual end time border and intermediate time borders in the other region by
evaluating a signal variation criterion;

(f) Deriving the frequency resolution by evaluating the energy of every frequency band
flanked by the low-resolution borders for every time segment obtained above.

12. A method to be adopted in bandwidth extension strategies utilising the said
time/frequency grid coding approach, where an analysis filterbank transforms an audio
signal into a plurality of low-frequency subband signals, where portions of the said
subband signals are replicated to the high-frequency region, where the said replicated
subbands are divided into time segments using the said time borders information and
subsequently into frequency bands using the said frequency resolutions information and
subsequently modulated by the said envelope data, where a synthesis filterbank
transforms the said low-frequency subband signals and the said envelope-adjusted subband
signals into a bandwidth-extended, time-domain signal, characterised by

(a) Deriving the start time border from the end time border of the previous frame of
envelope data;

(b) Detecting the most drastic transient time slot with a transient detector in the
spectral data between the said start time border and the furthest allowed end time
border;

(c) Detecting which of the regions, one between the transient border and the start time
border, another between the transient border and the furthest allowed end time border,
has the most varying spectral data;

(d) If the said most varying spectral data is found in the region between the transient
border and the furthest allowed end time border, finding and instantiating an actual end
time border and intermediate time borders in the said region by evaluating a signal
variation criterion;

(e) If the said most varying spectral data is found in the region between the transient
border and the start time border, finding and instantiating intermediate borders in the
said region by evaluating a signal variation criterion, then finding and instantiating
an actual end time border and intermediate time borders in the other region by
evaluating a signal variation criterion;

(f) Deriving the frequency resolution by evaluating the energy of every frequency band
flanked by the low-resolution borders for every time segment obtained above.

13. Software coded in a programming language that provides a function achieved by the
determination method according to any one of claims 1 to 12.

14. A data recording medium for storing the software according to claim 13.

2 TITLE OF THE INVENTION

Method for Determining Time Borders and Frequency Resolutions for Spectral Envelope 
Coding 

3 DETAILED DESCRIPTION OF THE INVENTION 



3.1 Industrial Field of Utilisation 

This invention introduces a systematic segmentation method to determine the time borders 
and frequency resolution for bandwidth extension technologies that employ a subband coding 
strategy, such as the Spectral Band Replication (SBR) technology. 

3.2 Background and Prior Art 

The objective of audio coding is to transform a digitised audio stream into a compressed 
representation (or bitstream) at the audio encoder, so that as high fidelity to the original 
source as possible is retained after the bitstream is processed at the decoder. One popular way 
of compression is shown in Figure 1, which shows a typical audio coding system comprising 
an encoder and a decoder. Module 1000 divides the audio signal in time domain into 
consecutive frames, module 1010 transforms each frame of audio signal into frequency 
domain and module 1020 quantizes the spectrum up to a certain frequency (known as the 
bandwidth) at the encoder. One possible way for module 1010 to transform the audio signal 
into frequency domain is the time/frequency grid approach as shown in Figure 18, where a 
filterbank is employed to split an audio signal into multiple subbands, each representing a 
portion of the signal within a narrow frequency range in time domain. At the decoder, the 
audio spectrum is de-quantized by module 1030 and inverse-transformed by module 1040 
back into audio frames. The audio frames are then appropriately assembled by module 1050 
to form a continuous audio stream. 

As the bitrate (number of bits per second) of coding decreases, more of the bandwidth has to
be sacrificed by not coding the high-frequency portion, as it is deemed less perceptually
important than the low-frequency portion. The consequence is that some high-frequency
tones, and harmonics of the low-frequency tones, are lost. Figure 2 illustrates the above
band-limiting operation, where 2020 indicates the resultant bandwidth of the coded audio.

The objective of bandwidth extension is to recover the high-frequency portion by coding
it using very few additional bits. One example of such a technique is the Spectral Band
Replication (SBR) method (International Patent Publication WO 98/57436), which is now an
MPEG standard (ISO/IEC 14496-3, 2001 AMD1). Figure 3 illustrates one possible encoder
structure for SBR that is relevant to this invention. At the outset, the audio signal is
split into N subbands with N subband filters at the 'analysis filterbank' 3010, each
capturing a part of the signal's frequency spectrum. The N signals produced by the filters are
decimated to remove redundancy. The bandwidth extension coder 3020 extracts some
information from the filter outputs so that at the decoder, the low-frequency subbands can use
the information to extend the bandwidth of the audio signal. The bandwidth extension
information is then multiplexed at 3030 with the output of the core audio encoder 3000 to
form a bitstream. A nominal SBR frame consists of L outputs from each subband filter.

Figure 4 illustrates the decoder for the SBR method that is relevant to this invention. At the
outset, a bitstream is demultiplexed at 4000 into the core audio bitstream and the
bandwidth extension bitstream. The core decoder 4010 decodes the core audio bitstream to
produce the band-limited audio signal in time domain. The band-limited audio signal is then
split into M subbands with M subband filters of the 'analysis filterbank' 4020.
Higher-frequency subbands are synthesized using the bandwidth extension information at this
subband level. The new higher-frequency subbands, as well as the lower-frequency subbands,
are up-sampled and assembled with an N-filter 'synthesis filterbank' 4040 to output the final
bandwidth-extended signal.

The output from the analysis filterbank 3010 can be viewed as a time/frequency grid
representation of the audio signal as shown in Figure 18. As part of the bandwidth extension
information, the time/frequency representation is to be divided first in the time direction into
'time segments' and then in the frequency direction into 'frequency bands'. For each
frequency band, its average energy is computed, quantized and coded. This process is known
as spectral envelope coding. Figure 5 illustrates such a segmentation process, which is fully
described in International Patent Publication WO 01/26095 A1. In the figure, 5010 depicts
segmentation in the time direction, and 5020 depicts segmentation in the frequency direction.
At the decoder, the data generated by this process is used to shape the energy of the
synthesised high-frequency bands, so that it takes on the same energy envelope as the original
audio signal. Without proper segmentation, low-energy areas would be forced to share the
same average energy value as the high-energy areas. This would in turn lead to erroneous
amplification at the decoder, which is a common source of audible artefacts.
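
The effect of segmentation on the coded envelope can be illustrated with a minimal sketch. The function name spectral_envelope, the nested-list grid layout and the numeric values are assumptions made purely for illustration; they are not part of the SBR syntax or of the prior art cited above.

```python
def spectral_envelope(grid, time_borders, freq_borders):
    """Average energy of every (time segment, frequency band) cell.

    grid          -- grid[f][t] is the energy of subband f in time slot t
    time_borders  -- ascending time-slot indices delimiting the time segments
    freq_borders  -- ascending subband indices delimiting the frequency bands
    """
    env = []
    for t0, t1 in zip(time_borders, time_borders[1:]):
        row = []
        for f0, f1 in zip(freq_borders, freq_borders[1:]):
            cells = [grid[f][t] for f in range(f0, f1) for t in range(t0, t1)]
            row.append(sum(cells) / len(cells))
        env.append(row)
    return env


# Hypothetical grid: 4 subbands, 8 time slots, quiet for 4 slots and then loud.
grid = [[1.0] * 4 + [100.0] * 4 for _ in range(4)]

# A border at slot 4 keeps the quiet and loud levels apart.
split = spectral_envelope(grid, [0, 4, 8], [0, 2, 4])

# Without that border, both regions share one average energy value.
merged = spectral_envelope(grid, [0, 8], [0, 2, 4])
```

In the merged case the quiet slots would be rendered at the shared average at the decoder, which is the erroneous amplification described above.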

Each SBR frame is partitioned in the time direction into time segments using 'borders'. The
prior art describes the method of using 'fixed' and 'variable' borders to achieve effective
spectral envelope coding. Referring to Figure 6, the fixed borders 6060, 6070 and 6100 coincide
with the borders 6010, 6020 and 6050 of the nominal SBR frames, whereas the variable
borders 6080 and 6090 of the current frame are allowed to encroach into the next nominal SBR
frame. The start border and the end border of a 'variable SBR frame' can either be a fixed
border or a variable border. If the start border and end border are both fixed borders, the
variable SBR frame coincides with the nominal SBR frame. The end border of the current
SBR frame automatically becomes the start border of the next SBR frame.

Between the start border and end border, the SBR frame is further partitioned into several
time segments by intermediate borders according to the prior art. If the start border and end
border are both fixed borders, the SBR frame is partitioned into uniform time segments. This
is known as the FIXFIX frame in the prior art (i.e. a FIX border as the start border and a FIX
border as the end border), and is depicted in Figure 7, where 7010 is the start border and 7020
is the end border. If a threshold detector finds a transient region in the current SBR frame, its
end border will become a 'variable' border that must be equal to or greater than the next
nominal SBR frame border. This is the so-called FIXVAR frame shown in Figure 8. It has a FIX
border as the start border 8010 and a VAR border as the end border 8050. The intermediate
borders 8020, 8030 and 8040 are specified relative to one another or the variable border,
where d_0, d_1, d_2, etc. are the relative border distances. According to Figure 8, the first relative
distance d_0 must start with the variable border. Subsequent relative distances start with the
previously determined intermediate borders.

Since the end border of the current SBR frame automatically becomes the start border of the 
next SBR frame, it's possible for an SBR frame to have two variable borders in case of 
transient behaviours in successive SBR frames. This is known as the VARVAR frame as 
illustrated in Figure 9. For a VARVAR frame, the intermediate borders can be specified as 
relative to either one of the variable borders. In the figure, intermediate border 9020 is 
relative to start border 9010, whereas intermediate borders 9030, 9040, 9050 are relative to 
each other or the variable end border 9060. 

Finally, if the transient detector cannot find any transient in the current SBR frame, but it 
begins with a variable border, it will still adopt a fixed border as its end border. This is the 
final frame class introduced in the prior art and is known as a VARFIX frame, as illustrated 
in Figure 10, where 10010 is a variable start border and 10050 is a fixed end border. 10020, 
10030 and 10040 constitute the intermediate borders progressively derived from d_0, d_1 and d_2.

To reduce bit consumption, the relative border distances between an intermediate border and 
a variable border can only take on a few pre-determined sizes. 

After marking a plurality of time segments with the above-mentioned borders, each time 
segment, flanked by two borders, is to be divided in the frequency direction into frequency 
bands. The exact spectral borders are derived using criteria that are irrelevant to this 
invention. Two possible resolutions can be specified: high or low. Figure 11 shows the border
relationship between a high-resolution division and a low-resolution division. The borders of 
the low-resolution divisions are the alternate borders of the high-resolution division. 
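
The alternate-border relationship can be sketched as below. This is only an illustration under the assumption that the high-resolution division has an even number of bands; the function name and the subband indices are hypothetical.

```python
def low_resolution_borders(high_res_borders):
    """Keep every other high-resolution border, starting with the first.

    Assumes an even number of high-resolution bands, so that the last
    border is retained as well.
    """
    if (len(high_res_borders) - 1) % 2 != 0:
        raise ValueError("this sketch assumes an even number of high-resolution bands")
    return high_res_borders[::2]


# Hypothetical subband indices delimiting 8 high-resolution bands.
high = [0, 2, 4, 6, 8, 12, 16, 20, 24]
low = low_resolution_borders(high)  # 4 low-resolution bands
```

Each low-resolution band therefore spans exactly one pair of adjacent high-resolution bands.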

3.3 Problems

For the current SBR frame, upon the determination of the start border based on the end border of
the previous SBR frame, and the determination of a transient border using a threshold 
detector, a method is needed to determine the end border, and all intermediate borders. 

The problem is not straightforward because, as mentioned, all intermediate borders are to
be specified relative to one another or the variable borders, and all relative distances d_i can
only take on a few pre-determined sizes, d_i ∈ {D_1, D_2, D_3, D_4}, with
0 < D_1 < D_2 < D_3 < D_4. Moreover, only a syntactically pre-determined number of
intermediate borders are permitted. For the FIXVAR and VARVAR frame types, the end border
must be equal to or greater than the nominal SBR border. A systematic method is needed to
encompass all constraints imposed.

The default spectral coding strategy adopted by the prior art resorts to low temporal 
resolution but high spectral resolution (i.e. few time segments but more frequency bands). 
When a transient is detected, the prior art switches to high temporal resolution but low 
spectral resolution (i.e. more time segments but fewer frequency bands) to code the region after
the transient. The objective for switching the degrees of resolution is to account for the fact 
that a transient tends to exhibit more temporal variation than spectral variation. Lowering the 
frequency resolution can help curb a sudden surge in bit consumption. However, this method 
is not sufficient if the post-transient region exhibits a high degree of spectral variation that 
warrants a higher resolution, such as the case of a sudden burst of a tonal signal.

3.4 Solution to the Problems

3.4.1 Determination of Time Borders 

To determine the time borders, this invention discloses a systematic method to determine the 
end border and all intermediate borders while taking into account all syntactic constraints 
imposed by the decoder. 

As in the prior art, the frame type for the current SBR frame is determined according to the
type of end border of the previous frame, as well as the presence of a transient in the current 
SBR frame. The start border is also determined according to the end border of the previous 
SBR frame. 

For a FIXFIX frame, a low time resolution setting is used. 

For a FIXVAR frame and a VARVAR frame, a search for possible intermediate borders is
first conducted in the region after the transient time slot. The end border is also determined at 
this stage. Then, another search is conducted in the region before the transient time slot for 
possible intermediate borders, if the first stage hasn't already exhausted the maximum 
number of borders allowed. 

For the VARFIX frame, only one search needs to be conducted, in the whole region flanked
by a variable start border and a fixed end border. 

All of the above are accomplished with two Forward Search operations and one Backward 
Search operation. They employ the same principle, which is based on evaluating the signal 
variation of a time segment, but with minor variations to suit the scenarios in which they are 
applied. 

3.4.2 Determination of Frequency Resolution 

To determine the frequency resolution, this invention discloses an adaptive method that
objectively assesses the energy variation in the spectral direction. 

Since the borders of low-resolution division are the alternate borders of high-resolution 
division, a high resolution is first assumed and average energies are computed for each 
frequency band. For every pair of frequency bands flanked by the low-resolution borders, the 
ratio of energies is computed. If the minimum of all energy ratios computed for the
entire time segment exceeds a pre-determined threshold, a high frequency resolution is
adopted. Otherwise, a low frequency resolution is adopted. Noting the importance of
employing high temporal resolution in the post-transient region, the method applies a stricter
criterion for the adoption of high frequency resolution in this region.
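
A minimal sketch of this decision for one time segment is given below. Several details are assumptions made for illustration only: the ratio of a band pair is taken as the larger energy divided by the smaller, the stricter post-transient criterion is modelled as a multiplied threshold, and the names choose_frequency_resolution and post_transient_factor are hypothetical.

```python
def choose_frequency_resolution(band_energies, threshold,
                                post_transient=False, post_transient_factor=2.0):
    """Return "high" or "low" for one time segment.

    band_energies -- average energies of the high-resolution bands; consecutive
                     pairs (0,1), (2,3), ... share one low-resolution band
    threshold     -- pre-determined threshold on the minimum pair ratio
    """
    eps = 1e-12  # guards against division by zero for silent bands
    ratios = []
    for a, b in zip(band_energies[0::2], band_energies[1::2]):
        # Assumed ratio definition: larger energy over smaller energy (>= 1).
        ratios.append(max(a, b) / max(min(a, b), eps))
    # Stricter criterion right after a transient (assumed to be a scaled threshold).
    limit = threshold * post_transient_factor if post_transient else threshold
    return "high" if min(ratios) > limit else "low"


# Hypothetical energies: every pair differs by at least a factor of 3.
energies = [1.0, 4.0, 2.0, 10.0, 3.0, 9.0]
normal = choose_frequency_resolution(energies, threshold=2.0)
after_transient = choose_frequency_resolution(energies, threshold=2.0,
                                              post_transient=True)
```

With these example values the same segment qualifies for high resolution in the normal case but not in the post-transient case, since the raised limit of 4 is no longer exceeded by the minimum ratio of 3.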

3.5 Embodiments 

The methods below are examples explained in the context of SBR. However, their
applicability extends to any embodiment utilising spectral envelope coding based on a
time/frequency grid.

3.5.1 Determination of Time Borders

The embodiment for the determination of time borders is presented as a series of flowcharts
shown in Figures 12 to 15.

3.5.1.1 Overview 

Figure 12 shows an overview of the overall time border determination operation. 12010 sets 
the first border border[0] to the end border of the previous SBR frame. It also initialises the 
border counter noBorder to 1. 12020 activates the transient detector for the current frame, to 
check for the most drastic transient behaviour from border[0] to (next nominal SBR border +
V), where V is the amount of transgression into the next SBR frame allowed by the syntax.

If a transient is found, 12030 checks the end border of the previous SBR frame for its type. If
it's a FIX border, the current frame becomes a FIXVAR type in 12050; if it's a VAR border,
the current frame becomes a VARVAR type in 12090. In either case, the transient border is
registered in border[1] and noBorder is incremented.

If a transient is not found, 12040 checks the end border of the previous SBR frame for its
type. If it's a FIX border, the current frame becomes a FIXFIX type in 12130; if it's a VAR
border, the current frame becomes a VARFIX type in 12150.
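
The frame class decision of 12030 and 12040 reduces to two binary inputs and can be sketched as follows. The function and parameter names are hypothetical and chosen only for this illustration.

```python
def frame_class(prev_end_is_variable, transient_found):
    """Frame class of the current SBR frame.

    The start border inherits the type of the previous frame's end border;
    the end border is variable only when a transient is found.
    """
    start = "VAR" if prev_end_is_variable else "FIX"
    end = "VAR" if transient_found else "FIX"
    return start + end


# The four combinations correspond to the four branches of Figure 12.
classes = [frame_class(v, t) for v in (False, True) for t in (False, True)]
```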

If the current frame is FIXVAR, 12060 checks the region between the said transient and (next 
nominal SBR border + V) for possible need for intermediate borders. The Forward Search 
(Type I) method to be described in 3.5.1.2 is used for this purpose. At the end of Forward 
Search, noBorder is checked in 12070. If noBorder is found to be below the maximum 
allowed number of borders MaxBorder, 12080 uses a Backward Search method to check the 
region between the said transient and the start border and instantiate more intermediate 
borders if necessary. The above sequence of operations prioritises the post-transient region in 
finding intermediate borders. 

If the current frame is VARVAR, 12100 checks the region between the said transient and 
(next nominal SBR border + V) for possible need for intermediate borders using the same 
Forward Search (Type I) method to be described in 3.5.1.2. At the end of Forward Search, 
noBorder is checked in 12110. If noBorder is found to be below the maximum allowed 
number of borders MaxBorder, 12120 uses another Forward Search method (Type II) to 
check the region between the said transient and the start border and instantiate more 
intermediate borders if necessary. Again, the above sequence of operations prioritises the 
post-transient region in finding intermediate borders. 

If the current frame is FIXFIX, 12140 opts for a low temporal resolution setting. More is
discussed in 3.5.2. 

If the current frame is VARFIX, 12160 checks the region between the start border and the
next nominal SBR frame border for possible need for intermediate borders. The
aforementioned Forward Search (Type II) method is used for this purpose.

The four branches of operations culminate in 12170, which sorts the borders generated in
ascending order for later processing.

Figure 17 depicts the employment of the three search types in the four frame types, where
17010 and 17020 denote the forward search (type I) operation, 17040 and 17050 denote the
forward search (type II) operation, and 17030 denotes the backward search operation.

The post-transient region is prioritised in the intermediate border determination process in the
above embodiment. However, it is also possible to select which of the regions should be
prioritised by evaluating signal variations. If the signal variation is larger in the pre-transient
region, the pre-transient region is prioritised; if it is larger in the post-transient region,
the post-transient region is prioritised.



3.5.1.2 Forward Search (Type I) 



This Forward Search (Type I) method is designed for a region that starts with a transient and
ends with a variable border which is yet to be determined. Its objective is to determine the
intermediate borders and the end border. Three input parameters, border1, border2 and
noBorderLimit, must be initialised according to 12060 and 12100 of Figure 12 to delineate the
search zone (between border1 and border2) and the maximum number of borders permitted.

The flowchart of this method is shown in Figure 13. The method uses two intermediate
variables i and j to track the left and the right border of a time segment; k is used to index the
relative border distance D_k for the current time segment. 13010 initialises i to border1 and k
to 2. 13020 checks whether i is still below the nominal SBR frame border and noBorder
hasn't exceeded noBorderLimit. If the condition is passed, more intermediate borders can
still be instantiated, so 13030 sets the next possible edge of the current time segment, j, to
i+D_2. 13040 checks whether j remains within border2. If j exceeds border2, then D_k is not a
usable relative border distance. The method reverts to the previous relative border distance,
D_{k-1}, by subtracting 1 from k in 13090 and registering a new border at i+D_k in 13100. The
number of borders is updated by incrementing noBorder. If the method arrives at 13100 via
the 'no' decision path of 13040, then the border just registered would later become the
variable end border of this SBR frame.

On the other hand, if 13040 produces a 'yes' decision, the method proceeds to evaluate a signal
variation criterion to find out whether a new border is necessary. However, if D_k is already
the maximum allowed relative border distance (D_4 in this example), as reflected in 13050,
then the signal variation criterion need not be evaluated, as a new border becomes
compulsory. The method branches directly to 13100 to register the new border.

If D_k is not yet D_4, a variable peak_ratio is evaluated in 13060 for the region between i
and j-1. One possible criterion for a new intermediate border can be based on checking the
ratio of the energy of each time slot to the average energy of the entire time segment. It is
carried out in 13070 as shown:

    peak\_ratio = \min_{i \le m \le j-1} \left( \frac{ET_m}{\overline{ET}} \right) > Tr_1

where,

ET_m is the energy of time slot m,
\overline{ET} is the average energy of all time slots, computed from i to j-1,
Tr_1 is a pre-determined threshold value.

Another possible signal variation criterion can be based on comparing the largest and
smallest energy as follows:

    peak\_ratio = \frac{\text{largest } ET_m \text{ of all time slots from } i \text{ to } j-1}{\text{smallest } ET_m \text{ of all time slots from } i \text{ to } j-1} > Tr_1

Lastly, the signal variation criterion can be based on comparing the largest and smallest
absolute amplitudes as follows:

    peak\_ratio = \frac{\text{largest absolute amplitude of all time slots from } i \text{ to } j-1}{\text{smallest absolute amplitude of all time slots from } i \text{ to } j-1} > Tr_1

If peak_ratio exceeds the threshold Tr_1, then the large signal variation warrants a new border.
However, as the current D_k causes the large signal variation, D_{k-1} should be the desired
relative border distance. As a result, the value of k is decremented in 13090 and a new border
is registered in 13100.

If peak_ratio is not above the threshold Tr_1, the signal variation is considered fairly even, so a
larger D_k is attempted by first incrementing k and then adjusting j in 13080.
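
The border-growing loop of 13020 to 13100 can be sketched as follows. This is a minimal illustration rather than a transcription of Figure 13. It uses the largest-to-smallest energy criterion, restarts every new segment at D_2, counts the remaining borders in a hypothetical borders_left argument instead of comparing noBorder with noBorderLimit, and stops if even the fallback distance does not fit into the search zone. All of these choices, and the function names, are assumptions.

```python
def peak_ratio(slot_energy, i, j):
    """Largest-to-smallest energy ratio over time slots i .. j-1."""
    seg = slot_energy[i:j]
    return max(seg) / max(min(seg), 1e-12)  # guard against silent slots


def forward_search_type1(slot_energy, border1, border2, nominal_border,
                         sizes, threshold, borders_left):
    """Return the borders instantiated after border1 (the transient slot).

    sizes -- allowed relative distances D_1 < D_2 < ... in ascending order;
             sizes[k] is D_(k+1) because Python lists are 0-based
    """
    borders = []
    i = border1
    while i < nominal_border and borders_left > 0:
        k = 1  # start from D_2
        while True:
            j = i + sizes[k]
            if j > border2:                 # overshoots the zone: fall back
                k -= 1
                break
            if k == len(sizes) - 1:         # largest size: border is compulsory
                break
            if peak_ratio(slot_energy, i, j) > threshold:
                k -= 1                      # variation too large: fall back
                break
            k += 1                          # fairly even: try a larger distance
        if i + sizes[k] > border2:          # assumed guard, not in the text
            break
        i += sizes[k]
        borders.append(i)
        borders_left -= 1
    return borders


# Hypothetical slot energies: transient at slot 2, steady signal from slot 4.
energy = [1, 1, 50, 40] + [5] * 12
found = forward_search_type1(energy, 2, 12, 10, [1, 2, 3, 4], 4.0, 3)
```

In this example the first border lands at slot 4, where the decay of the transient ends, and the remaining borders are spaced at the largest allowed distance because the signal is steady.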

The process repeats until 13020 finally returns a 'no' decision. The method then proceeds to
13110 to check whether, despite noBorderLimit having been used up, the last border (which
would become the variable end border) is still below the nominal SBR frame border. This is an
important consideration because the SBR syntax requires that the end border be equal to or
greater than the nominal SBR frame border. If the last border already satisfies this
requirement, the operation terminates. Otherwise, the method begins a process of expanding
the relative border distances until the last border satisfies the requirement.

One possible method to expand the relative border distances is by sacrificing the relative
border distance that's the furthest away from the transient border first. Starting from 13120, i
is initialised to index the last border. 13130 checks the relative border distance between
border[i] and border[i-1]. If the difference is not less than D_4, this relative border distance
cannot be expanded, so i is decremented so that the relative border distance between
border[i-1] and border[i-2] is checked subsequently. However, if the difference is below D_4,
the relative distance between border[i] and border[i-1] is expanded in 13160. The process is
repeated until the last border is greater than or equal to the nominal SBR frame border, as
verified in 13170.
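
A minimal sketch of this furthest-first expansion is shown below. The text does not state by how much a distance is expanded in 13160, so the sketch assumes that it grows to the next larger allowed size and that the border concerned, together with all later borders, shifts by the same amount. It also does not check the furthest allowed end border. The function name and the example values are hypothetical.

```python
def expand_to_nominal(borders, nominal_border, sizes):
    """Expand relative distances, furthest from the transient first.

    borders -- ascending borders; borders[0] is the transient border and
               borders[-1] the prospective variable end border
    sizes   -- allowed relative distances in ascending order
    """
    borders = list(borders)
    i = len(borders) - 1
    while borders[-1] < nominal_border and i > 0:
        dist = borders[i] - borders[i - 1]
        if dist >= sizes[-1]:
            i -= 1                 # already at the largest size: move inward
            continue
        # Assumed step: grow to the next larger allowed distance.
        step = next(s for s in sizes if s > dist) - dist
        for n in range(i, len(borders)):
            borders[n] += step     # shift this border and all later ones
    return borders


# Hypothetical case: the end border at 5 falls short of the nominal border.
short = expand_to_nominal([2, 3, 4, 5], nominal_border=8, sizes=[1, 2, 3, 4])
longer = expand_to_nominal([2, 3, 4, 5], nominal_border=10, sizes=[1, 2, 3, 4])
```

In the second call the last distance reaches the largest size before the nominal border is met, so the next distance inward is expanded as well.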

Another method of expanding the relative border distances is more computationally intensive.
It tries to increase every relative border distance between borders, checks the signal
characteristics between the new borders, and applies the actual increase to the relative border
distance that causes the least overall increase in between-border signal variations. The
operation is then repeated until the end border becomes equal to or greater than the nominal
SBR frame border. However, from experience, the region that is least varying is also the region
that is furthest away from the transient border, because if the region near the transient border
were the most varying, this characteristic would have already been captured by the presence
of closely spaced intermediate borders near the transient border.

3.5.1.3 Forward Search (Type II)

This Forward Search (Type II) method is designed for a region that starts with a variable or 
fixed border, and ends with a border that has already been determined, such as the transient 
border or a fixed border. Unlike the Type I method, its objective is to determine the 
intermediate borders only. Three input parameters, border1, border2 and noBorderLimit, must
be initialised according to 12120 and 12160 of Figure 12 to delineate the search zone and the 
maximum number of borders permitted. 

The flowchart of this method is shown in Figure 14. In principle, the two search methods are 
the same. Therefore, operations 14010 to 14100 are almost identical to operations 13010 to 
13100 of Figure 13, with a few exceptions: 

In 14020, instead of checking whether the leading edge of the current time segment is below
the next nominal SBR frame, the new constraint is for the leading edge to be below
border2-D2.

If 14020 returns a 'no' decision, the operation terminates. There is no need for the operation
to expand any relative border distances (i.e. unlike 13110 onwards in Figure 13) because an
end border need not be found.

Similarly, in 14040, if the trailing edge of the current time segment exceeds border2, the
operation terminates right away, as opposed to registering a new border (i.e. the branching
from 13040 to 13090 in Figure 13), because an end border is not necessary.

In 14100, the peak_ratio of a new border has to be stored when it is instantiated. This is to
facilitate 14110, which removes redundant borders. Redundant borders are sometimes created
because the size allowed for the current time segment has reached a maximum. Since the
border locations are specified relative to each other, such a border is necessary if more
borders are to be created subsequently. However, if it is the last border, it can be removed
without causing any problem.
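
One way to picture the removal step in 14110 is sketched below. The sketch assumes that a border counts as redundant when its stored peak_ratio does not exceed the threshold, meaning it was created only because the segment reached its maximum size, and that only trailing borders may be dropped. The function name and the `threshold` parameter are hypothetical.

```python
def remove_redundant_borders(borders, peak_ratios, threshold):
    """Drop trailing borders whose stored peak_ratio shows they were forced
    by the maximum segment size rather than by signal variation (assumed rule)."""
    borders = list(borders)
    peak_ratios = list(peak_ratios)
    # Only the last border can go: earlier ones anchor the relative
    # positions of the borders that follow them.
    while borders and peak_ratios[-1] <= threshold:
        borders.pop()
        peak_ratios.pop()
    return borders, peak_ratios


print(remove_redundant_borders([3, 5, 9, 13], [2.5, 3.0, 1.2, 1.1], threshold=2.0))
```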

3.5.1.4 Backward Search 

This Backward Search method is designed for a region that starts with a transient and ends
with a start border. Three input parameters, border1, border2 and noBorderLimit, must be
initialised according to 12080 of Figure 12 to delineate the search zone and the maximum 
number of borders permitted. 

The flowchart of this method is shown in Figure 15. In principle, the method is the same as
Forward Search (Type II). Therefore, operations 15010 to 15110 are almost identical to
operations 14010 to 14110, except that they are performed in the reverse direction:
instead of incrementing j relative to i, backward searching decrements j relative to i.
Specifically,

Instead of i<=border2-D2 in 14020, there is i>=border2+D2 in 15020 because i will get
increasingly closer to the start border (i.e. border2). 

Instead of j<=border2 in 14040, there is j>=border2 in 15040 for the same reason mentioned 
above. 
Instead of computing peak_ratio for time slots i to j-1 in 14060, 15060 computes peak_ratio
for time slots j to i-1.

Instead of computing j=i+Dk in 14030 and 14080, there is j=i-Dk in 15030 and 15080.
Finally, instead of computing i=i+Dk in 14100, there is i=i-Dk in 15100.

3.5.2 Low Temporal Resolution for FIXFIX 

A FIXFIX frame has no transient characteristics in its vicinity, so it is logical to use very
few time borders to save coding bits. For SBR, the time/frequency representation for a FIXFIX
frame is divided uniformly according to the number of borders chosen. A simple method of
choosing the number of borders is to try the lowest number of borders first and evaluate the
peak_ratio of each time segment formed. If any peak_ratio exceeds a certain threshold, a
larger number of borders is tried, and the evaluation of peak_ratio for each time segment
formed is repeated. The process terminates when the peak_ratio values of all time segments
formed are below the threshold, or when the maximum number of borders has been reached.
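
The trial procedure above can be sketched as follows. The sketch assumes that peak_ratio is the peak time-slot energy of a segment divided by its mean time-slot energy, that the candidate divisions are 1, 2 and 4 equal segments, and that the frame length is divisible by each candidate; the function names, the candidate list and the threshold value are all illustrative.

```python
def peak_ratio(slot_energy):
    """Assumed definition: peak slot energy over mean slot energy."""
    mean = sum(slot_energy) / len(slot_energy)
    return max(slot_energy) / mean if mean > 0 else 1.0


def choose_fixfix_segments(slot_energy, candidates=(1, 2, 4), threshold=2.0):
    """Return the smallest uniform division whose segments all stay at or
    below the threshold, or the largest candidate if none does."""
    n = len(slot_energy)
    for count in candidates:                  # lowest number of segments first
        size = n // count
        segments = [slot_energy[k * size:(k + 1) * size] for k in range(count)]
        if all(peak_ratio(s) <= threshold for s in segments):
            return count
    return candidates[-1]                     # maximum reached


print(choose_fixfix_segments([1.0] * 16))
print(choose_fixfix_segments([1.0] * 8 + [10.0] * 8, threshold=1.5))
```

A stationary frame is coded with a single segment, while a frame whose energy steps up halfway through is split in two.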

3.5.3 Determination of Frequency Resolution 

The embodiment for the determination of frequency resolution is illustrated by way of an 
example shown in Figure 16. The borders of low-resolution division are the alternate borders 
of high-resolution division. 

Initially, the average energy for every frequency band in a time segment is computed, 
assuming that a high frequency resolution is adopted. The average energy is denoted by E\. 

If the high frequency resolution is even, then satisfying the following condition will lead to 
the selection of high frequency resolution; otherwise, the low frequency resolution will be
selected: 



If the high frequency resolution is odd, then satisfying the following condition will lead to the 
selection of high frequency resolution; otherwise, the low frequency resolution will be
selected: 



where the threshold is

FREQ_RES_THRESHOLD_2, for the first n time segments after a threshold border
FREQ_RES_THRESHOLD_1, otherwise

and FREQ_RES_THRESHOLD_2 > FREQ_RES_THRESHOLD_1. This implies that for the n
time segments after a threshold time slot, it is harder to adopt a high frequency resolution
because a higher time resolution is favoured.

While the average energy is used for the determination in the above embodiment, any other 
parameter like amplitude information, which represents signal variation, can be used instead. 
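
The threshold switching can be illustrated with the short sketch below. It assumes that the larger threshold applies to the first n time segments, which is what makes high frequency resolution harder to adopt there. The scalar `variation` stands for whatever measure of energy variation across the high-resolution bands is compared with the threshold; its exact form is not reproduced here, so it is taken as an input. All names are hypothetical.

```python
def choose_freq_resolution(variation, segment_index, n, threshold_1, threshold_2):
    """Pick 'high' or 'low' frequency resolution for one time segment.

    segment_index counts time segments from 0, starting at the border after
    which the stricter threshold applies (an assumed convention).
    """
    # Stricter (larger) threshold for the first n segments, looser afterwards.
    threshold = threshold_2 if segment_index < n else threshold_1
    return "high" if variation > threshold else "low"


# The same variation is judged differently depending on the segment position.
print(choose_freq_resolution(3.0, segment_index=0, n=2, threshold_1=2.0, threshold_2=4.0))
print(choose_freq_resolution(3.0, segment_index=2, n=2, threshold_1=2.0, threshold_2=4.0))
```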

3.6 Effects of the Invention 

The time border determination method provides a systematic way to perform frame
segmentation in the pre-transient and post-transient regions by evaluating the change of
energy in time. It provides good sound quality by emphasizing the post-transient region over
the pre-transient region, and the region closest to the onset of the transient over the region
further away, while taking into consideration all syntactic constraints imposed. The adaptive
frequency resolution determination method checks the energy distribution in the frequency
direction in the post-transient region. It resorts to high-resolution segmentation if a large
variation in the energy distribution is detected. Together, the two methods of the invention
realise a good and easily implemented strategy for segmentation of the time-frequency
representation of SBR technology.
Figure 1: A typical audio coding system

Figure 2: Limitation of bandwidth owing to bitrate consideration causes a loss of some
high-frequency tones and harmonics

Figure 3: A possible encoder of a subband coding scheme for bandwidth extension

Figure 6: Border relationships between four frame types: FIXFIX, FIXVAR, VARFIX,
VARVAR.

Figure 7: A FIXFIX frame with fixed start and end borders. (Axes: time and frequency;
arrows mark time borders.)

Figure 8: A FIXVAR frame with a fixed start border, a variable end border greater than
the nominal SBR frame border, and some intermediate borders specified
relative to the end border or each other. (Axes: time and frequency; arrows mark time
borders.)

Figure 9: A VARVAR frame with a variable start border, a variable end border greater
than the nominal SBR frame border, and some intermediate borders specified
relative to the two end borders or each other.

Figure 10: A VARFIX frame with a variable start border, a fixed end border, and some
intermediate borders specified relative to the start border or each other.

Figure 11: Border relationship between a high-resolution time segment and a low-
resolution time segment

Figure 12: The overall flowchart of the time border determination part of the invention

Figure 13: The flowchart of the Forward Search (Type I) operation

Figure 14: The flowchart of the Forward Search (Type II) operation

Figure 15: The flowchart of the Backward Search operation

Figure 16: Illustration for the frequency resolution determination part of the invention

Figure 17: Employment of the three search operations in various parts of the four frame
types.

Figure 18: A typical time/frequency grid representation for audio coding

5 ABSTRACT

5.1 Issue

For subband-coding-based bandwidth extension methods such as SBR, proper segmentation in
both the time and frequency directions is important to prevent low-energy areas from sharing
the same average energy value with high-energy areas. Otherwise, wrongful amplification
might occur at the decoder, which might lead to audible artefacts.

5.2 Solution 

As in the prior art, the frame type for the current SBR frame is determined according to the
type of end border of the previous frame, as well as the presence of a transient in the
current SBR frame. The start border is determined according to the end border of the previous
SBR frame. For a FIXFIX frame, a low time-resolution setting is used. For a FIXVAR or a
VARVAR frame, a search for intermediate borders is conducted in the region between the
transient and the maximum allowed end border location. The end border is also determined at
this stage. If there is excess capacity for more borders, another search is conducted in the
region between the transient and the start border. For a VARFIX frame, only one search needs
to be conducted, in the whole region flanked by a variable start border and a fixed end
border. All of the above are accomplished with two Forward Search operations and one
Backward Search operation. They employ the same principle, which is based on evaluating the
signal variation of a time segment, but with minor variations to suit the scenarios in which
they are applied. To determine the frequency resolution, a high resolution is first assumed
and average energies are computed for each frequency band. For every pair of frequency bands
flanked by the low-resolution borders, the ratio of energies is computed. If the minimum of
all energy differences computed for the entire time segment exceeds a pre-determined
threshold, a high frequency resolution is adopted. Otherwise, a low frequency resolution is
adopted. The method applies a stricter criterion for the adoption of high frequency
resolution in the post-transient region.